Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge ReadableByteStream into ReadableStream #430

Closed
wants to merge 1 commit into from

Conversation

tyoshino
Copy link
Member

This commit includes the following design changes:

  • sync the spec text with the new reference implementation
  • Finalize pull/pullInto behavior #423
    • add byobRequest getter to the controller
  • port the asserts in the non-byte source version to the byte source version
  • remove auto release feature from the byte source version
  • rename Byob to BYOB
  • move TransferArrayBuffer to helpers.js
  • make the default highWaterMark of the byte source version to 0
  • port the functionality that the start method can delay pulling by returning a pending Promise to the byte source version
  • add [[disturbed]] feature to the byte source version
  • port highWaterMark mechanism to ReadableByteStreamController
  • merge ReadableStream and ReadableByteStream
    • now it behaves differently based on whether the given source has pullInto or not
    • bunch of renaming for merge
  • merge ReadableStreamReader and ReadableByteStreamReader
  • Revisiting what's reasonable return value for each method of the controller #424
    • make controller.close() and controller.enqueue() fail when the stream is not in the readable state
    • make controller.enqueue() throw a predefined TypeError, not the stored error
    • as a result test/pipe-to-options.js, readable-streams/general.js and test/readable-stream-templated.js have been updated
  • rename ReadableStreamController to ReadableStreamDefaultController
  • rename ReadableStreamReader to ReadableStreamDefaultReader
  • rename ReadableByteStreamController to ReadableStreamBYOBController
  • add automatic buffer allocation feature to BYOB controller
  • read(view) now checks view.byteLength before setting [[disturbed]]

@tyoshino
Copy link
Member Author

Spec text update WIP. It shouldn't render by bikeshed yet.

@domenic
Copy link
Member

domenic commented Mar 3, 2016

Yay, this is really exciting!

I added some commits of my own to this branch. I think it'll be better for us to work together than to just do review-ping-pong every day.

Here are some additional potential changes I'd like your opinion on:

  • _invalidate has to go away as right now it's kind of specced to be a public property (although its call sites access it using @[[ notation).
    • I moved _cancel and _pull to be "internal methods".
    • You could do a similar thing for _invalidate, but there's no need for polymorphic dispatch, so:
    • Either it should become an abstract operation, or...
    • We should consider moving as many abstract operations as possible to become internal methods.
      • This would be nice because it would remove the giant prefixes like "ReadableStreamDefaultController".
      • However the autolinking would become harder (so it would be broken for a while). In general you can't autolink to a method, but all of our methods except [[Cancel]] and [[Pull]] would be non-polymorphic, so we could in theory write some custom spec processors that makes them link.
      • Maybe it is best to just stick with abstract operations for the non-polymorphic case?
  • ReadableStreamBYOBControllerRespondInReadableStat seems unused?

@@ -3,21 +3,34 @@ const test = require('tape-catch');
// Many other pipeTo-with-options tests have been templated.

test('Piping with no options and a destination error', t => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did these tests change? (We'll want to summarize any test changes in the commit message since they reflect potential observable behavior changes.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I see. It's all because of #424.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I've added a line explaining why this change happened in the OP of the PR

@domenic
Copy link
Member

domenic commented Mar 3, 2016

I added some examples to the appendix. Please review them for correctness. Especially the socket one, as I am not sure that socket.setTCPWindowSize(0) is the correct way to signal backpressure, or if socket.setTCPWindowSize is a realistic API. (I tried to look through tcp(7) man pages but there was no straightforward thing saying "here is how you set the TCP window size", so maybe we should use a more complicated example API.) Is this at all how things might be implemented under the hood in Blink, @yutakahirano?

@yutakahirano
Copy link
Member

let bytesRead;
if (r) {
  bytesRead = socket.readInto(r.view.buffer, r.view.byteOffset, r.view.byteLength);
  r.respond();
} else {
  const buffer = new ArrayBuffer(DEFAULT_WINDOW_SIZE);
  bytesRead = socket.readInto(buffer, 0, DEFAULT_WINDOW_SIZE);
  controller.enqueue(buffer);
}

bytesRead is not used, is it correct?

@domenic
Copy link
Member

domenic commented Mar 4, 2016

Ah, it is supposed to be the argument to respond(). So it doesn't need to be assigned in the second branch

@yutakahirano
Copy link
Member

In the second branch, you need to set the buffer's size to readBytes, I think.

@yutakahirano
Copy link
Member

Regarding backpressure:

Is it possible to not read when desiredSize <= 0? I'm not sure how precisely you need to manage the total buffer size, but when you stops reading from the socket the system will stop reading from the network and setTCPWindowSize(0) is not needed, IIUC.

@domenic
Copy link
Member

domenic commented Mar 4, 2016

Is it possible to not read when desiredSize <= 0? I'm not sure how precisely you need to manage the total buffer size, but when you stops reading from the socket the system will stop reading from the network and setTCPWindowSize(0) is not needed, IIUC.

The problem is that the kernel queue (aka kernel buffer) could overflow. The point of the enqueue() system is to move things into a userspace queue to avoid that.

@yutakahirano
Copy link
Member

The problem is that the kernel queue (aka kernel buffer) could overflow.

So you want to minimize the kernel buffer usage when the consumer doesn't want the data, right?

@domenic
Copy link
Member

domenic commented Mar 4, 2016

Yeah, that's the idea.

@yutakahirano
Copy link
Member

I'm sorry I don't have enough knowledge about how Chrome uses a TCP socket. TCPSocketPosix::SetReceiveBufferSize doesn't look to be used for such purpose, but I may be wrong. You will find far better reviewers in net-dev@chromium.org.

@tyoshino
Copy link
Member Author

tyoshino commented Mar 8, 2016

I added some commits of my own to this branch. I think it'll be better for us to work together than to just do review-ping-pong every day.

OK!

  • _invalidate has to go away as right now it's kind of specced to be a public property (although its call sites access it using @[[ notation).

Fixed

  • I moved _cancel and _pull to be "internal methods".

Sounds good. Thanks

  • You could do a similar thing for _invalidate, but there's no need for polymorphic dispatch, so:
  • Either it should become an abstract operation, or...
  • We should consider moving as many abstract operations as possible to become internal methods.
    • This would be nice because it would remove the giant prefixes like "ReadableStreamDefaultController".
    • However the autolinking would become harder (so it would be broken for a while). In general you can't autolink to a method, but all of our methods except [[Cancel]] and [[Pull]] would be non-polymorphic, so we could in theory write some custom spec processors that makes them link.
    • Maybe it is best to just stick with abstract operations for the non-polymorphic case?

Ah, I see. OK. I'll make it an abstract operation.

  • ReadableStreamBYOBControllerRespondInReadableStat seems unused?

Fixed

@tyoshino
Copy link
Member Author

tyoshino commented Mar 8, 2016

In Linux (I'm reading 3.13), the size of window to advertise is determined from sk_rcvbuf (using tcp_adv_win_scale). By default, 3/4 of free space of sk_rcvbuf is advertised. The free space is calculated by deducting sk_rmem_alloc from sk_rcvbuf.

sk_rmem_alloc represents the number of bytes received (+overhead) but not yet consumed by reading out to the user land. It increases when skb_set_owner_r() is called on delivery of data to TCP layer. skb_set_owner_r() also sets sock_rfree() to the sk_buff's destructor which decreases sk_rmem_alloc. tcp_read_sock() invokes this sock_rfree() by calling sk_eat_skb() on finished sk_buff. tcp_read_sock() is invoked on tcp_recvmsg().

When sending data or ack only packet, __tcp_select_window() is called to calculate the up-to-date window size to advertise in the packet. The calculation explained above happens in this function.

TCP_WINDOW_CLAMP socket option bounds window. But it seems only when window scaling is not turned on. See tcp_sock's window_clamp.

SO_RCVBUF socket option sets sk_rcvbuf. See SO_RCVBUF handling code in net/core/sock.c.

These values cannot be less than SOCK_MIN_RCVBUF / 2. But it should be ok. It's small, and while read(2) doesn't happen the buffer gets filled to make sk_rmem_alloc hit sk_rcvbuf and let __tcp_select_window() decrease the advertised window.

When tcp_moderate_rcvbuf sysctl option is enabled, tcp_rcv_space_adjust() may automatically increase the space every time when read(2) consumes data from the kernel buffer. This is done based on the amount copied to user space in the last RTT and some strategy (considering memory pressure, etc.).

Setting SO_RCVBUF disables this automatic increase (see that SOCK_RCVBUF_LOCK is set to sk->userlocks) and reset the buffer size limit (sk_rcvbuf) to the given value.

So, IIUC, TCPSocketPosix::SetReceiveBufferSize() would help us prevent / recover from unexpected increase of amount of data being buffered in kernel.

@tyoshino
Copy link
Member Author

tyoshino commented Mar 9, 2016

Sorry for spamming. Batching into one comment.

55eb984 lgtm
0aa9a98 lgtm committed a typo fix.
a465cf4 lgtm
8a8ba18 lgtm
8a8ba18 lgtm
beb60c1 lgtm

@tyoshino
Copy link
Member Author

tyoshino commented Mar 9, 2016

Self-reply to #430 (comment)

So, IIUC, TCPSocketPosix::SetReceiveBufferSize() would help us prevent / recover from unexpected increase of amount of data being buffered in kernel.

SO_RCVBUF limits total memory consumption including overhead. So, this value should be set to a slightly larger value (e.g. like tcp_adv_win_scale is reflected to the advertised window size).

@tyoshino
Copy link
Member Author

tyoshino commented Mar 9, 2016

Done reviewing

@domenic
Copy link
Member

domenic commented Mar 9, 2016

Just noticed: IsReadableStreamBYOBRequest is not defined, and a brand check is missing in the view getter.

@domenic
Copy link
Member

domenic commented Mar 9, 2016

Sorry for the spam. I will fix the things I commented on except the examples. (That is: adding IsReadableStreamBYOBRequest, adding the missing brand check, moving the BYOB request abstract ops, and the syntax in the .js files.)

@domenic
Copy link
Member

domenic commented Mar 9, 2016

Another issue: there is a mixture of spec internal types and JS types in the ReadableStreamBYOBRequest constructor. It cannot take a Record as its second parameter.

I'd suggest passing the view directly and letting the caller construct it.

@tyoshino
Copy link
Member Author

The automatic buffer allocation feature discussed at #430 (comment) has been added. See the newly added test.

Before making the change for the feature, I made some refactoring changes. The changes are separated into small commits, so should be easy to review.

@tyoshino
Copy link
Member Author

Oh, it looks like "Readable Stream BYOB Controller Abstract Operations" is not alphabetized.

Fixed all by 89e2d31

@tyoshino
Copy link
Member Author

That is, the main separation is that underlyingSource contains properties relevant to the stream creator, whereas queuingStrategy is something the consumer might be interested in setting.

Ah, OK. That makes sense!

@tyoshino
Copy link
Member Author

Updated the example to use the automatic buffer allocation feature.

@@ -1729,6 +1739,11 @@ table:
1. Set *this*@[[totalQueuedBytes]] to *0*.
1. Set *this*@[[started]], and *this*@[[closeRequested]] to *false*.
1. Set *this*@[[strategyHWM]] to ValidateAndNormalizeHighWaterMark(_highWaterMark_).
1. Let _autoAllocateChunkSize_ be GetV(_underlyingByteSource_, `"autoAllocateChunkSize"`).
1. If _autoAllocateChunkSize_ is not *undefined*,
1. If Number.isInteger(_autoAllocateChunkSize_) is *true* or _autoAllocateChunkSize_ &lt; 0,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't look right. (And we shouldn't reference public overridable API functions like Number.isInteger.)

I think instead what we want to do is:

  1. Set autoAllocateChunkSize to ToInteger(autoAllocateChunkSize).
  2. ReturnIfAbrupt(autoAllocateChunkSize).
  3. If autoAllocateChunkSize ≤ 0, or if autoAllocateChunkSize is +∞ or -∞, throw a RangeError exception.

(NaN is automatically converted to 0 so will still throw.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Thanks for fixing!

@domenic
Copy link
Member

domenic commented Mar 22, 2016

Sorry for the delay in getting back to you on this. I made some more tweaks to auto-allocate based on my review comments.

I think we should work on merging this soon. Do you think it's ready? Would you mind drafting up a commit message? The main things I want to be sure are included are:

  • How this differs from the previous ReadableByteStream separate class model
  • Details on the normative changes to existing ReadableStream behavior, including reference to why the tests were changed
  • A vague idea of how the under-the-hood changes work (i.e., moving a lot of things to the controllers)

After we get this merged I would like to do an editorial pass on my own through the document. Largely it will consist of adding more cross-links (both within-spec and to other specs) and also updating to use the latest ES conventions (which allow us to use ? and ! prefixes and get rid of ReturnIfAbrupt).

@tyoshino
Copy link
Member Author

Thanks for fixing. The commit basically looks good. See the replies to comments.

@tyoshino
Copy link
Member Author

Here's a draft of the commit log:


Merge ReadableByteStream into ReadableStream

Background

We originally designed the ReadableByteStream which includes extended features to handle bytes as a separate class from the ReadableStream which handles a stream of general objects.

While designing the details of ReadableByteStream (see #361), we noticed that we could simplify the ReadableStream and ReadableByteStream by moving variables and logic for handling queuing into their controller class (see #379). This turned out to be also clarifying which part of the logic is representing semantic requirement of readable streams and which part of that is implementing helper logic for easing development of underlying sources.

After the above refactoring, we also noticed that ReadableStream and ReadableByteStream share most of their code. So, we attempted to merge the ReadableByteStream into the ReadableStream.

Change summary

The new ReadableStream has two reader getter methods when BYOB reading is available. Availability of BYOB reading is determined by whether or not the underlying source passed to the ReadableStream had BYOB pulling functionality. This is indicated by the "byob" parameter of the ReadableStream constructor.

The two readers are named the default reader and BYOB reader. The default reader's read() method takes no argument. The resulting promise of the read() method will be fulfilled by a generated chunk. The BYOB reader's read() method takes one argument. It must be an ArrayBufferView. The ReadableStream will fill the passed ArrayBufferView and fulfills the returned promise with it. The ArrayBufferView might be transferred several times.

When the byob parameter is false, the underlying source is given a ReadableStreamDefaultController on start() method call. This class provides methods to enqueue a new chunk and know the status of the queue. Its queuing strategy is configured by the parameters passed to the ReadableStream's constructor. The underlying source can subscribe to events on which chunks are drained from the queue by implementing the "pull" method on it. Only the getReader() method will be available on the ReadableStream.

When the byob parameter is false, the underlying source is given a ReadableStreamBYOBController on start() method call. In addition to the functionalities of the ReadableStreamDefaultController, this controller provides a getter named byobRequest() which exposes the oldest outstanding BYOB reading request into which the underlying source can put bytes directly (see #423). Both the getReader() and the getBYOBReader() method will be available on the ReadableStream.

The ReadableStreamBYOBController can be configured to convert read requests from a default reader into BYOB read request by allocating a buffer and exposing it at the byobRequest getter automatically. This ease implementation of a reactive underlying source.

Changes included in this commit

In addition to the major changes as described above, this commit includes bunch of design / logic / aesthetic changes as follows:

  • remove auto release feature from the ReadableByteStream
  • rename Byob to BYOB
  • move TransferArrayBuffer to helpers.js
  • make the default highWaterMark of the byte source version to 0
  • port the functionality that the start method can delay pulling by returning a pending Promise to the ReadableStreamBYOBController
  • implement [[disturbed]] feature to the byte handling logic
  • port highWaterMark mechanism to ReadableByteStreamController
  • changes on semantics of the controller methods (see Revisiting what's reasonable return value for each method of the controller #424)
    • make controller.close() and controller.enqueue() fail when the stream is not in the readable state
    • make controller.enqueue() throw a predefined TypeError, not the stored error
    • as a result test/pipe-to-options.js, readable-streams/general.js and test/readable-stream-templated.js have been updated
  • rename ReadableStreamController to ReadableStreamDefaultController
  • rename ReadableStreamReader to ReadableStreamDefaultReader
  • rename ReadableByteStreamController to ReadableStreamBYOBController
  • read(view) now checks view.byteLength before setting [[disturbed]]

@tyoshino
Copy link
Member Author

Feel free to edit, Domenic.

## Background

We originally designed ReadableByteStream, which included extended features to handle bytes, as a separate class from the ReadableStream, which handles a stream of general objects.

While designing the details of ReadableByteStream (see #361), we noticed that we could simplify ReadableStream and ReadableByteStream by moving variables and logic for handling queuing into their controller class (see #379). This turned out to also clarify which part of the logic represents semantic requirements of readable streams, and which part of it is implementing helper logic for easier development of underlying sources.

After the above refactoring, we also noticed that ReadableStream and ReadableByteStream share most of their code. So, we merged the ReadableByteStream class into ReadableStream. This has many benefits for developers who don't have to deal with two similar-but-different classes. Instead, the same class is used, with the behavior customized by the underlying source.

## Change summary

The new ReadableStream class has two reader acquisition methods, getReader() and getBYOBReader(), with the latter working when BYOB reading is available. Availability of BYOB reading is determined by whether or not the underlying source passed to the ReadableStream had BYOB pulling functionality. This is indicated by a truthy `byob` property of the underlying source.

The two readers are named the default reader and BYOB reader. The default reader's read() method takes no argument. The resulting promise of the read() method will be fulfilled with a newly-allocated chunk. The BYOB reader's read() method takes one argument; it must be an ArrayBuffer view. The ReadableStream will fill the passed ArrayBuffer view, and fulfill the returned promise with it. The ArrayBuffer view might be transferred several times, but the same backing memory is always written to.

When the byob option is falsy, the underlying source is given a ReadableStreamDefaultController. This class provides methods to enqueue a new chunk and know the status of the queue. Its queuing strategy is configured by the parameters passed to the ReadableStream's constructor. The underlying source can subscribe to know when chunks are drained from the queue by implementing the "pull" method. Only the getReader() method will be functional on the ReadableStream.

When the byob option is truthy, the underlying source is given a ReadableStreamBYOBController. In addition to the functionalities of the ReadableStreamDefaultController, this controller provides a getter named `byobRequest` which exposes the oldest outstanding BYOB reading request into which the underlying source can put bytes directly (see #423). Both the getReader() and the getBYOBReader() method will be functional on the ReadableStream.

The ReadableStreamBYOBController can be configured to convert read requests from a default reader into BYOB read requests, by automatically allocating a buffer and exposing it via the byobRequest getter. This eases implementation of a reactive underlying source, as shown in one of the new examples.

## Changes included in this commit

In addition to the major changes as described above, this commit includes bunch of design /logic/aesthetic changes as follows:

### Changes to existing observable features

Although ReadableStream's internals were refactored immensely, its external behavior (when not providing a BYOB source) is almost identical to before, as verified by our extensive unit tests. However, we did make a few changes which are observable:

- Changes to the semantics of the controller methods (see #424):
  - Make controller.close() and controller.enqueue() fail when the stream is not in the readable state
  - Make controller.enqueue() throw a predefined TypeError, not the stored error
  - (As a result of these changes, the tests test/pipe-to-options.js, test/readable-streams/general.js, and test/readable-stream-templated.js have been updated.)
- Rename ReadableStreamController to ReadableStreamDefaultController
- Rename ReadableStreamReader to ReadableStreamDefaultReader

### Changes to byte streams

As explained above, byte streams were changed in fairly extensive ways to merge them into the base ReadableStream class. Here we call out a few notable changes from the previous specification text:

- Remove auto release feature from the ReadableByteStream
- Rename Byob to BYOB
- Make the default highWaterMark of the byte source version to 0
- Port the functionality that the start method can delay pulling by returning a pending promise to the ReadableStreamBYOBController
- Port the highWaterMark mechanism to ReadableByteStreamController
- Rename ReadableByteStreamController to ReadableStreamBYOBController
- Correctly update the [[disturbed]] slot in the byte handling logic
- read(view) now checks view.byteLength before setting [[disturbed]]
@domenic
Copy link
Member

domenic commented Mar 25, 2016

OK! I have rebased and squashed and updated the commit message with my tweaks. I'll let you do the honors of any final review, and of merging!

@tyoshino
Copy link
Member Author

Closed by e601d69

Thanks!!

@tyoshino tyoshino closed this Mar 28, 2016
@tyoshino tyoshino deleted the rsrbsmergesquash branch March 28, 2016 12:54
isonmad pushed a commit to isonmad/streams that referenced this pull request Nov 3, 2016
autoAllocateSize was added in whatwg#430 as commit e601d69.

The spec was out of sync with the reference implementation: it tried to round
non-integers into integers instead of rejecting them.

The reference implementation was out of sync with the spec: it was allowing
an autoAllocateChunkSize of 0 through instead of throwing a RangeError.

The toInteger helper was added in 200b54b
and its only user was removed in commit 5b47faa.
isonmad pushed a commit to isonmad/streams that referenced this pull request Nov 3, 2016
autoAllocateChunkSize was added in whatwg#430 as commit e601d69.

The spec was out of sync with the reference implementation: it tried to round
non-integers into integers instead of rejecting them.

The reference implementation was out of sync with the spec: it was allowing
an autoAllocateChunkSize of 0 through instead of throwing a RangeError.

The toInteger helper was added in 200b54b
and its only user was removed in commit 5b47faa.
isonmad pushed a commit to isonmad/streams that referenced this pull request Nov 3, 2016
autoAllocateChunkSize was added in whatwg#430 as commit e601d69.

The spec was out of sync with the reference implementation: it tried to round
non-integers into integers instead of rejecting them.

The reference implementation was out of sync with the spec: it was allowing
an autoAllocateChunkSize of 0 through instead of throwing a RangeError.
domenic pushed a commit that referenced this pull request Nov 3, 2016
autoAllocateChunkSize was added in #430 as commit e601d69.

The spec was out of sync with the reference implementation: it tried to round non-integers into integers instead of rejecting them.

The reference implementation was out of sync with the spec: it was allowing an autoAllocateChunkSize of 0 through instead of throwing a RangeError.
@nidhijaju nidhijaju mentioned this pull request Oct 29, 2020
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

3 participants